[CI] Enable linux-aarch64-a2 (64GB) and tp2 * 2 max-parallel to speed up CI #2065

Potabk · 2025-07-28T09:25:58Z

What this PR does / why we need it?

Our workflow currently takes about 3 hours in total to run, which seriously hurts the developer experience, so optimization is urgent. After this PR, the full CI run is expected to drop to about 1h40min.

- Enable linux-aarch64-a2 (64GB) to replace linux-arm64-npu (32GB)
- Change TP4 ---> TP2 * 2 max-parallel
- Move DeepSeek-V2-Lite-W8A8 to single card test
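
To illustrate why running two TP2 jobs side by side can finish sooner than running every suite one at a time on a single TP4 job, here is a toy model in Python. The suite durations are made-up placeholders, not measurements from this CI, and `makespan` is a hypothetical helper. It also assumes each suite takes the same time under TP2 as under TP4, which may not hold in practice.

```python
import heapq


def makespan(durations, max_parallel):
    """Return the wall-clock time to run all jobs with at most
    `max_parallel` running at once, assigning each job (in order)
    to whichever slot frees up first."""
    if max_parallel < 1:
        raise ValueError("max_parallel must be >= 1")
    # Each heap entry is the time at which a slot becomes free.
    slots = [0] * max_parallel
    heapq.heapify(slots)
    for d in durations:
        start = heapq.heappop(slots)
        heapq.heappush(slots, start + d)
    return max(slots)


# Hypothetical suite durations in minutes (illustrative only).
suites = [30, 30, 20, 20]

serial = makespan(suites, max_parallel=1)    # one job at a time
parallel = makespan(suites, max_parallel=2)  # two jobs at once
```

With these placeholder numbers the parallel schedule takes half the serial time; real savings depend on how evenly the suites split and on per-job setup overhead.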

Does this PR introduce any user-facing change?

No

How was this patch tested?

vLLM version: v0.10.0
vLLM main: vllm-project/vllm@a248025

Signed-off-by: wangli <wangli858794774@gmail.com>

codecov · 2025-07-28T11:03:24Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.83%. Comparing base (935e9d4) to head (4f04bfd).
⚠️ Report is 616 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2065      +/-   ##
==========================================
- Coverage   73.85%   73.83%   -0.03%     
==========================================
  Files         103       96       -7     
  Lines       11425    10865     -560     
==========================================
- Hits         8438     8022     -416     
+ Misses       2987     2843     -144

Flag	Coverage Δ
unittests	`73.83% <ø> (-0.03%)`	⬇️

Yikun · 2025-07-29T07:55:42Z

cc @wangxiyuan @ganyi1996ppo @jianzs @ApsarasX @zzzzwwjj @yiz-liu @whx-sjtu @Angazenn @mengwei805

FYI, after this PR we will use A2 (64GB) in CI

Yikun · 2025-07-29T08:08:45Z

https://github.com/vllm-project/vllm-ascend/blob/main/benchmarks/scripts/run_accuracy.py
https://github.com/vllm-project/vllm-ascend/blob/main/tests/e2e/multicard/test_offline_inference_distributed.py

These seem to need changes as well?

Also, many test functions should be added to the YAML; this can be done in a new PR. For example:

def test_models_distributed_pangu():
def test_models_distributed_topk() -> None:
def test_models_distributed_Qwen3_W8A8():

should be included.
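
One way to spot tests that are defined but never referenced by the workflow is to compare the `test_*` functions in a test module against the workflow text. The sketch below is only an illustration: the `workflow_text` snippet and the `missing_tests` helper are hypothetical, and the real workflow YAML may select tests differently (for example by whole file rather than by node ID).

```python
import ast


def missing_tests(test_source, workflow_text):
    """Return names of top-level test_* functions in `test_source`
    that never appear in `workflow_text`."""
    tree = ast.parse(test_source)
    names = [
        node.name
        for node in tree.body
        if isinstance(node, ast.FunctionDef) and node.name.startswith("test_")
    ]
    return [name for name in names if name not in workflow_text]


# Stubs for the three functions named in the comment above.
test_source = '''
def test_models_distributed_pangu():
    pass

def test_models_distributed_topk() -> None:
    pass

def test_models_distributed_Qwen3_W8A8():
    pass
'''

# Hypothetical workflow fragment; the real YAML is not shown in this thread.
workflow_text = """
pytest -sv tests/e2e/multicard/test_offline_inference_distributed.py::test_models_distributed_topk
"""

print(missing_tests(test_source, workflow_text))
```

The check is a plain substring match, so a test whose name is a prefix of another referenced test would be reported as covered; a stricter version would parse the pytest node IDs.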

… up CI (vllm-project#2065)

### What this PR does / why we need it?
Our workflow currently takes about 3 hours in total to run, which seriously hurts the developer experience, so optimization is urgent. After this PR, the full CI run is expected to drop to about 1h40min.

- Enable linux-aarch64-a2 (64GB) to replace linux-arm64-npu (32GB)
- Change TP4 ---> TP2 * 2 max-parallel
- Move DeepSeek-V2-Lite-W8A8 to single card test

### Does this PR introduce _any_ user-facing change?
No

- vLLM version: v0.10.0
- vLLM main: vllm-project/vllm@a248025

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: weijinqian_v1 <weijinqian@huawei.com>

### What this PR does / why we need it?
Switch Infra to linux-aarch64-a2 and python to 3.11

Soft backport: #2065
Soft backport: #2072

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
CI passed
search all: `linux-arm64-npu` and `3.10`

---------

Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
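
The "search all" check for `linux-arm64-npu` and `3.10` could be approximated with a small script like the one below. This is a sketch under assumptions: `find_stale` is a hypothetical helper, and the file names and contents are invented for illustration, not taken from the repository.

```python
import re


def find_stale(files, needles):
    """Return (path, line_number, needle) for every line that still
    contains one of `needles` as a standalone token."""
    hits = []
    for path, text in files.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            for needle in needles:
                # Reject matches embedded in longer tokens such as "3.100",
                # but still allow suffixed labels such as "<needle>-1".
                pattern = r"(?<![\w.])" + re.escape(needle) + r"(?!\w)"
                if re.search(pattern, line):
                    hits.append((path, lineno, needle))
    return hits


# Hypothetical file contents, for illustration only.
files = {
    "old.yaml": "runs-on: linux-arm64-npu-1\npython-version: '3.10'\n",
    "new.yaml": "runs-on: linux-aarch64-a2-1\npython-version: '3.11'\n",
}

hits = find_stale(files, ["linux-arm64-npu", "3.10"])
```

An empty `hits` list would suggest that no references to the old runner label or Python version remain in the scanned files.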

Potabk added 8 commits July 28, 2025 17:52

upgrade

d168266

Signed-off-by: wangli <wangli858794774@gmail.com>

repair test case

cdddd37

Signed-off-by: wangli <wangli858794774@gmail.com>

fix name

4d6f4d8

Signed-off-by: wangli <wangli858794774@gmail.com>

repair

cdc275a

Signed-off-by: wangli <wangli858794774@gmail.com>

fix

794158a

Signed-off-by: wangli <wangli858794774@gmail.com>

fix long term

4f04388

Signed-off-by: wangli <wangli858794774@gmail.com>

max parallel limit

7aaa0bf

Signed-off-by: wangli <wangli858794774@gmail.com>

repair e2e test yaml

4d1ee72

Signed-off-by: wangli <wangli858794774@gmail.com>

Potabk force-pushed the ci_opt branch from 8a34769 to 4d1ee72 Compare July 28, 2025 09:57

github-actions bot added the module:tests label Jul 28, 2025

fix

974b950

Signed-off-by: wangli <wangli858794774@gmail.com>

remove pin

caa7b53

Signed-off-by: wangli <wangli858794774@gmail.com>

Yikun changed the title ~~[CI] Speed up CI~~ [CI] Enable linux-aarch64-a2 (64GB) and change tp4 --> tp2 * 2 max-parallel to speed up CI Jul 29, 2025

Yikun changed the title ~~[CI] Enable linux-aarch64-a2 (64GB) and change tp4 --> tp2 * 2 max-parallel to speed up CI~~ [CI] Enable linux-aarch64-a2 (64GB) and tp2 * 2 max-parallel to speed up CI Jul 29, 2025

Yikun approved these changes Jul 29, 2025

fix tp

4f04bfd

Signed-off-by: wangli <wangli858794774@gmail.com>

Yikun added the accuracy-test (enable all accuracy test for PR) and ready-for-test (start test by label for PR) labels Jul 29, 2025

wangxiyuan merged commit f60bb47 into vllm-project:main Jul 29, 2025
32 checks passed

Potabk deleted the ci_opt branch July 29, 2025 11:34

Yikun mentioned this pull request Jul 31, 2025

[v0.9.1] Switch Infra to linux-aarch64-a2 and python to 3.11 #2119

Merged
